Structural Properties as Proxy for Semantic Relevance in RDF Graph Sampling
Authors
Abstract
The Linked Data cloud has grown to become the largest knowledge base ever constructed. Its size is now turning into a major bottleneck for many applications. To facilitate access to this structured information, this paper proposes an automatic sampling method targeted at maximizing answer coverage for applications using SPARQL querying. The approach is novel: no similar RDF sampling approach exists. Additionally, the concept of creating a sample aimed at maximizing SPARQL answer coverage is unique. We empirically show that the relevance of triples for sampling (a semantic notion) is influenced by the topology of the graph (purely structural), and can be determined without prior knowledge of the queries. Experiments show a significantly higher recall of topology-based sampling methods over random and naive baseline approaches (e.g. up to 90% for Open-BioMed at a sample size of 6%).
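As a rough illustration of the idea in the abstract, the sketch below ranks triples by a purely structural score and measures how many query-relevant triples the resulting sample retains. The abstract does not name the topology measures used, so the degree-based score here is an assumption chosen for simplicity, and the function names `sample_by_degree` and `recall` are hypothetical, not taken from the paper.

```python
from collections import Counter


def sample_by_degree(triples, fraction):
    """Keep the top `fraction` of triples, ranked by a structural score.

    Assumption: the score is the summed degree of the subject and object
    nodes. This stands in for whatever topology measure is actually used.
    """
    degree = Counter()
    for s, _, o in triples:
        degree[s] += 1
        degree[o] += 1
    # sorted() is stable, so ties keep their original order.
    ranked = sorted(triples, key=lambda t: degree[t[0]] + degree[t[2]], reverse=True)
    k = max(1, int(len(triples) * fraction))
    return ranked[:k]


def recall(sample, needed):
    """Fraction of the triples required by queries that the sample contains."""
    needed = set(needed)
    if not needed:
        return 1.0
    return len(needed & set(sample)) / len(needed)


# Toy graph: three triples point at one hub node, one triple is isolated.
triples = [
    ("a", "p", "hub"),
    ("b", "p", "hub"),
    ("c", "p", "hub"),
    ("d", "q", "e"),
]
sample = sample_by_degree(triples, 0.5)
needed = [("a", "p", "hub"), ("d", "q", "e")]
print(sample, recall(sample, needed))
```

Note that the ranking uses no information about the queries themselves; only the recall check does, which mirrors the claim that relevance can be estimated without prior knowledge of the queries.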
Similar resources
ASSG: Adaptive structural summary for RDF graph data
RDF is considered to be an important data model for Semantic Web as a labeled directed graph. Querying in massive RDF graph data is known to be hard. In order to reduce the data size, we present ASSG, an Adaptive Structural Summary for RDF Graph data by bisimulations between nodes. ASSG compresses only the part of the graph related to queries. Thus ASSG contains fewer nodes and edges than existi...
Tracking RDF Graph Provenance using RDF Molecules
The Semantic Web facilitates integrating partial knowledge and finding evidence for hypothesis from web knowledge sources. However, the appropriate level of granularity for tracking provenance of RDF graph remains in debate. RDF document is too coarse since it could contain irrelevant information. RDF triple will fail when two triples share the same blank node. Therefore, this paper investigate...
Using RDF Summary Graph For Keyword-based Semantic Searches
The Semantic Web began to emerge as its standards and technologies developed rapidly in the recent years. The continuing development of Semantic Web technologies has facilitated publishing explicit semantics with data on the Web in RDF data model. This study proposes a semantic search framework to support efficient keyword-based semantic search on RDF data utilizing near neighbor explorations. ...
An Improved Semantic Schema Matching Approach
Schema matching is a critical step in many applications, such as data warehouse loading, Online Analytical Process (OLAP), Data mining, semantic web [2] and schema integration. This task is defined for finding the semantic correspondences between elements of two schemas. Recently, schema matching has found considerable interest in both research and practice. In this paper, we present a new impr...
Adaptive Rdf Graph Replication for Mobile Semantic Web Applications
An increasing number of applications are based on Semantic Web technologies and the amount of information available on the Web in the form of RDF is continuously growing. The adaption of the Semantic Web for Personal Information Management and the increasing desire for mobility is often accompanied by situations where no network connectivity is available and hence access to remote data is limit...